COMPLEXITY  AS  A  FACTOR 

OF  QUALITY  AND  COST 
IN  LARGE  SCALE  SOFTWARE 
DEVELOPMENT 


Joe  Newton  Harris 




NAVAL  POSTGRADUATE  SCHOOL 

Monterey,  California 


THESIS 


COMPLEXITY  AS  A  FACTOR  OF  QUALITY  AND  COST 

IN 
LARGE  SCALE  SOFTWARE  DEVELOPMENT 

by 

Joe  Newton  Harris 


December     1979 


Thesis   Advisor: 


N.    F.    Schneidewind 


Approved for public release; distribution unlimited.




UNCLASSIFIED

REPORT DOCUMENTATION PAGE

 4.  TITLE (and Subtitle): Complexity as a Factor of Quality and
     Cost in Large Scale Software Development
 5.  TYPE OF REPORT & PERIOD COVERED: Master's Thesis;
     December 1979
 7.  AUTHOR(s): Joe Newton Harris
 9.  PERFORMING ORGANIZATION NAME AND ADDRESS: Naval
     Postgraduate School, Monterey, California 93940
11.  CONTROLLING OFFICE NAME AND ADDRESS: Naval
     Postgraduate School, Monterey, California 93940
12.  REPORT DATE: December 1979
13.  NUMBER OF PAGES: 97
14.  MONITORING AGENCY NAME & ADDRESS: Naval
     Postgraduate School, Monterey, California 93940
15.  SECURITY CLASS. (of this report): Unclassified
16.  DISTRIBUTION STATEMENT (of this Report): Approved for
     public release; distribution unlimited.
19.  KEY WORDS: Software Complexity, Software Quality,
     Software Cost Estimation

Approved for public release; distribution unlimited.

Complexity  as  a  Factor  of  Quality  and  Cost 

in 
Large  Scale  Software  Development 


by 


Joe Newton Harris

Lieutenant  Commander,  United  States  Navy 

B.A.,  Duke  University,  1968 

MBA,  National  University,  1975 


Submitted  in  partial  fulfillment  of  the 
requirements  for  the  degree  of 


MASTER OF SCIENCE IN MANAGEMENT

from  the 

NAVAL  POSTGRADUATE  SCHOOL 

December  1979 




ABSTRACT 

The impact of complexity on software quality and costs
is examined. Historic and current issues relating to
complexity in the software development and software cost
estimation processes are reviewed. Select complexity models
and metrics are described and briefly analyzed. Finally,
an argument is presented in support of McCabe's Directed
Graph Model as a useful software management tool in
controlling complexity, formulating a test strategy and
allocating resources.

I.   INTRODUCTION

Total procurement costs of large scale computer systems
are conventionally divided into costs associated with
production of hardware components (computers and peripheral
equipment) and costs associated with software development
(program design, coding, test, maintenance and
documentation). While hardware costs dominated in early computer
models, the combined effects of improved cost-reducing
technology in the production of hardware and marked rises in the
costs of labor to develop programs have resulted in a
dramatic reversal in this situation today. [1, 2] Indeed, if
these trends continue, software costs will converge to
approximately 90% of total computer procurement costs in the mid
1980's (see Figure 1). [3]


FIGURE 1. HARDWARE/SOFTWARE COST TRENDS

From a management perspective, the steady incremental
reduction in hardware unit costs has been a concrete
manifestation of technological gains and increased productivity.
Although management problems have arisen, they have been the
livable difficulties of a high growth industry meeting each
new challenge with multiple technical solutions, thereby
constraining managers only by their abilities to adapt to
and harness new opportunities.

Conversely, the rising costs of software are directly
indicative of a critical mismatch between complex needs and
limited technical abilities. Trends and pressures leading
to this situation have existed from the beginning attempts
to apply general purpose computers to progressively more
complex and larger problems of society. As awareness of a
large number of system development failures and near failures
increased in the recent past (illustrated by cost overruns,
schedule slippages and performance degradations), a growing
appreciation of the scope of unsolved technical and
productivity problems began to emerge.

This awareness was well summarized at the 1973 "Symposium
on the High Cost of Software" in a statement that continues
to apply today: "Progress in software technology has been
very slow, but demands for software production are increasing
in volume and complexity. Such demands have clearly
outstripped the technology, with very costly results.
Production of new software products suffers great overruns in cost
and delivery time, and quality is often deficient in
correctness, modifiability and transferability. The
maintenance costs of old software products may be an order of
magnitude larger than production costs, due to poor original
design and production." [4]

In order to close the gap between existing software
technology and production demands, a number of noteworthy
programming/management techniques have been developed and
implemented with varying success. These developments include
computer aided specification generation, top-down design,
structured programming, chief programmer team, egoless
programming and program walkthrus.

Also embedded in the historic problems of developing
large scale software has been an inability to produce accurate
project cost and schedule estimates and a corresponding
managerial failing to correctly assess risk and critically
evaluate estimates and associated underlying assumptions presented
by subordinate software estimating groups. The cumulative
project costs of developing and maintaining large scale
system software are determined by a myriad of interrelated
variables including the quality and stability of original design
specifications, the relative difficulty of the technical
problems involved, the productivity of the programming group
available, and the traditional project management skills of
efficient resource direction and utilization. The relative
distribution of available resources over production phases
varies with each project. However, studies have indicated
the average percentage resource requirement distributions
summarized in Table I. [2]

TABLE I

Percentage Distribution of Resource Utilization

                              DEVELOPMENT PHASE*

PROJECT TYPE                  ANALYSIS &   PROGRAM   TEST &
                              DESIGN       WRITING   INTEGRATION

Military Command &               15           15        50
Control System

Space Oriented System            35           20        45

IBM 360 Operating                35           15        50
System

*NOTE: This table ignores maintenance expenses incurred
after system deployment.

The occurrence of the proportionately high cost factor in the
test and integration phase as indicated in this summary has
come as an unpleasant surprise to many project managers and
to those supplying project funds. The chronic underestimating
of these costs is most directly attributable to a pervasive
lack of appreciation for the extent of required managerial
involvement and severity of potential pitfalls associated
with the iterative process of software quality assurance.
When a manager underestimates the dollar and time requirements
of the test phase, he often exacerbates them by embarking on
an inadequate initial effort which is essentially wasted.
System quality must be a focal management concern throughout
a project. Costs to recover during testing for earlier
management control mistakes are normally prohibitive. Further,
indirect costs, which do not appear as part of the project,
often result from pressured attempts to shortcut the test
phase. Such costs include late deliveries and resulting
slipped schedules, delivered system degradations and
associated spiraling life-cycle support costs as well as
difficulties in funding follow-on projects due to mistrust of
presented estimates and fear of further overruns.

Recently, significant efforts (referenced later) to
improve software management performance have centered on
recognition of software complexity as a quality and cost
determinant. If complexity can be measured, controlled
(e.g., by threshold) and shown to reliably predict the
probable effort required for error detection and correction,
an important tool will be available in the effort to
understand and manage large scale software development costs.

This thesis is aimed at investigating the impact of
complexity on software quality and costs and the potential
ability of management to exploit this impact. In conducting
the investigation, the cornerstone work by McCabe in
applying the cyclomatic number from directed graph theory as a
measurement proxy for software structural complexity and the
supportive experimental work at the Naval Postgraduate School
supervised by Schneidewind were particularly useful. Further,
field trips were made to three software production facilities
(TRW, Redondo Beach, Ca.; Hughes Aircraft Co., Fullerton, Ca.;
U. S. Navy's Fleet Combat Direction System Support Activity
(FCDSSA), San Diego, Ca.). These field trips served to
determine current cost estimating and resource allocation
procedures and to validate by interview the existing
confidence levels in complexity or other cost predictors by those
currently involved in this effort. Results of these trips
are cited as appropriate. While user/customer issues are
recognized where relevant, the perspective of the
development agency is emphasized.
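
As an illustration of how a structural complexity measure might
be computed and held to a threshold, the sketch below evaluates
the cyclomatic number v(G) = e - n + 2p (edges, nodes and
connected components of a program control graph) and lists the
modules that exceed a limit. The graph encoding, the function
names and the default limit of 10 are assumptions made for this
example only; the limit is the bound commonly attributed to
McCabe and is not taken from this chapter.

```python
from collections import defaultdict


def cyclomatic_number(edges, nodes=()):
    """Return v(G) = e - n + 2p for a directed control graph.

    edges: list of (source, destination) pairs, one per branch.
    nodes: optional extra nodes that have no edges (assumed input).
    """
    node_set = set(nodes)
    adjacency = defaultdict(set)
    for src, dst in edges:
        node_set.update((src, dst))
        # Components are counted on the undirected version of the graph.
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen = set()
    components = 0
    for start in node_set:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:  # depth-first walk of one component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)

    return len(edges) - len(node_set) + 2 * components


def modules_over_limit(module_edges, limit=10):
    """Names of modules whose cyclomatic number exceeds the limit.

    module_edges maps a (hypothetical) module name to its edge list.
    """
    return sorted(
        name
        for name, edges in module_edges.items()
        if cyclomatic_number(edges) > limit
    )


# A single IF-THEN-ELSE: 4 edges, 4 nodes, 1 component, so v(G) = 2.
IF_THEN_ELSE = [("entry", "then"), ("entry", "else"),
                ("then", "exit"), ("else", "exit")]
```

Under these assumptions, a straight-line module scores 1 and
each added decision raises the score by one, so a manager could
use `modules_over_limit` to single out modules for redesign or
for additional test effort.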

Chapter  II  discusses  issues  concerning  the  development 
and  control  of  large  scale  software.   Chapter  III  summarizes 
some  select  aspects  of  complexity  and  complexity  metrics 
relative  to  software.   Chapter  IV  reviews  a  recent  experiment 
relevant  to  the  application  of  complexity  measurement  theory 
to  management  practices.   Chapter  V  describes  the  resource 
estimation  problem  and  suggests  a  management  approach  to 
resource  allocation  utilizing  the  cyclomatic  number  metric 
as  a  guideline.   Finally,  Chapter  VI  offers  a  summary  and 
conclusions. 

II.  SOFTWARE DEVELOPMENT AND CONTROL

A.   NATURE OF SOFTWARE

Software includes both the conceptual solution to a
proposed problem and the documentation required to translate
this solution into a workable computer program. Its nature
is marked by a lack of measurable physical characteristics.
The management of software development historically suffered
because essential similarities and differences between
software development and traditional hardware design and
production were not well understood. Management understanding of
these comparisons is essential to controlling software quality.
The most important of these similarities and differences are
listed below: [5, 6]

-  While hardware engineers utilize a sequence of
development prototypes en route to the production model, software
projects often begin with a concept that the first version
developed will be the delivered product. This concept is
naturally reflected in personnel, monetary and calendar-time
estimates and expectations. History has indicated a definite
need for iteration in software development analogous to the
hardware development model.

-  The institutionalized sequence of hardware production
provides natural control points for management review and
design freezes. Software development has no such natural
points and often suffers from changes throughout. Design
freezes are essential for ordered software development and
must be arbitrarily imposed by management.

-  Hardware  engineers  expect  designs  to  be  fully  tested 
by  well  understood  procedures  and  customarily  prepare  test 
plans.  Although  pressures  to  formally  test  software  are  now 
substantial,  testing  techniques  are  still  at  an  innovative 
stage  and  much  quality  evaluation  remains  highly  dependent 
upon  individual  programmers. 

-  Hardware  is  essentially  composed  of  standard  parts  with 
stable  performance  characteristics.   Software  sub-routines 
are  often  new,  innovative  and  not  fully  understood. 

-  Hardware reliability is related to the passage of time
much differently than software reliability. With hardware,
"An accumulation of stresses is reached which causes a
component to fail." [6] Conversely, a software error exists due
to programmer activity or inadequate specification. "The
amount of time (labor and machine) involved in error detection
and the probability of error detection are a function of test
time, type of test, and choice of test data." [6] Barring
major modifications, software boasts an indefinite life,
continuing to improve (decreasing error rate) with mounting
testing and use.

-  A software module with a detected error cannot be pulled
off-line, replaced with a working unit and repaired. It must
be repaired in order to fix the system. (The idea of fault
tolerant programming incorporating redundant modules has been
used in real time applications requiring high reliability.)

-  Correction of a software fault generally results in a
new software configuration.

-  In the process of making additional copies of software,
no imperfections or variations are introduced (save for a
class of easily checked copying errors).

B.   SOFTWARE  QUALITY 

The quality of software has many aspects. Each aspect
can become overriding in importance, depending upon the
program application and the user's intention. During
development or design change implementation, ease of revising (and
verifying) is important. During deployment, ease of
operating is paramount. Similarly, if a need develops to adapt
the software to another system (hardware, software or both),
ease of transition will be an important attribute. Table II
[7] lists 11 software quality factors within this framework.
Although Table II does not necessarily provide a complete
list of quality factors, most additional terms or criteria
of software quality can be related to those described.


TABLE II

Software Quality Factors

QUALITY CATEGORY   QUALITY FACTOR          DEFINITION

I   REVISION       (1)  Maintainability    Ease of locating and
                                           correcting errors

                   (2)  Flexibility        Ease of modifying
                                           program

                   (3)  Testability        Ease of adequately
                                           testing (includes
                                           traceability: ease of
                                           linking requirements
                                           to design and code)

II  OPERATION      (4)  Correctness        Extent to which user
                                           requirements are met

                   (5)  Reliability        Extent of accurate and
                                           consistent operation

                   (6)  Efficiency         Relative optimal use
                                           of computing resources
                                           and code

                   (7)  Integrity          Relative ability to
                                           control unauthorized
                                           data access

                   (8)  Usability          Ease of learning,
                                           operating, preparing
                                           input and interpreting
                                           output

III TRANSITION     (9)  Portability        Ease of transfer from
                                           one hardware
                                           configuration or system
                                           software environment
                                           to another

                   (10) Reusability        Ease of applying to
                                           other programs
                                           (relative to packaging
                                           and scope)

                   (11) Interoperability   Ease of interfacing
                                           with another system(s)

The dominant aspect in all software quality factors is
program complexity. In general, as program structures become
more complex, the probability increases of encountering
difficulty in revision, operation and transition. [1, 2, 8]
Accordingly, controlling complexity is a key concern of
management in project development.

C.   DEVELOPMENT  CYCLE 

In order to accurately estimate and/or effectively control
a large scale software project, the development cycle must be
understood. Although different authors and managers vary in
some detail or nomenclature, the industry's successes and
failures have distilled a generally accepted progression of
activities necessary to produce a large scale computer program.
The major phases of interest are the following:

-  Analysis
     -  planning
     -  requirements definition
     -  specification
-  Design
-  Coding
-  Integration and Testing
-  Life Cycle Support/Maintenance

Figure 2 (substantially from [5]) depicts this development
cycle in chronological detail. It is important to note the
iterative nature of this cycle, represented in Figure 2 by
connecting arrows. Often events in one phase, such as testing,
stimulate reworking of problems in a previous phase, such as
coding or even design. Additionally, it is common for
significant phase overlaps to occur at certain stages (e.g.,
conducting testing prior to completion of all coding). As
mentioned earlier, freezing design at some point is an
essential function of management if the project is to be completed
on schedule and at planned-for cost.

FIGURE 2. SOFTWARE LIFE CYCLE

1.   Analysis (planning, requirements definition, specification)

The initial or analysis phase of a project can be
extended over a considerable time period and include several
key activities. The user/customer generates requirements
during this stage and communicates them to (potential)
project engineers. This is an iterative process and is
frequently stimulated by engineers (or marketeers) describing
what is possible to users. Two issues, validity of user
need and feasibility of solution, must be resolved prior to
promulgation of user requirements. The organization's
standard cost-effectiveness justification process is necessary
for the first, while an independent feasibility study is
normally initiated to settle the second. When either
process is circumvented, continuity of future organizational
decisions and actions is jeopardized.

When user requirements are articulated, they become
inputs for resource utilization estimates which, along with
resource availability issues, form the major considerations
of development agency top management review. This review
determines if the organization will pursue involvement (e.g.,
respond to Request for Proposal) and must assure that high
risk projects are discarded. [9, 10] Appropriate assessment
of potential system/program complexity is crucial to the
accuracy of this risk determination.

If the decision to continue is reached, specification
(i.e., translating requirements to guidelines for development)
is commenced as a final activity in the analysis phase. A
management review concluding the analysis phase affords
development agency management a final opportunity to determine
project continuance/termination prior to major resource
expenditures. Documentary output of the specification effort
will support this review and guide future progress of the
project. It is composed of detailed administrative and
technical documents which are meant to form the bases of all
user-developer contracts. [9]

2.  Design 

The design phase covers all remaining efforts required
to complete the technical solution in light of specifications
and imposed constraints. It culminates in describing the
best technical solution in terms that will facilitate coding.
[11] Short stopping errant designs is essential to avoid
massive, costly rework in a project's latter stages.
Customer/user involvement in the evaluation process is mandatory to
ensure continuing communication and to engender commitment to
approved designs before they key follow-on effort.

3.  Coding 

Coding includes both the translation of designs to
computer language and the process of documenting developed
programs.

The coding and design phases are particularly
interrelated and can be interspersed with a technical review of
each subroutine to track and assure progress. An important
management review is often held at the end of coding with
summary data available from all preceding technical reviews.
As a rule, a large percentage of planned project development
funds have been expended at this stage. With such a
commitment from the customer, project termination is rare once
coding is complete. Thus, while earlier reviews concentrate
on project continuance, the major objective now is "to
maintain schedules and budget by" shifting manpower from less
important activities to critical tasks, canceling or delaying
features, allowing standard practices to short-cut, and if all
fails, to immediately publish a schedule or budget "increase."
While these are management concerns throughout the project,
they become particularly germane at the completion of coding
when a genuine, albeit tenuous, attempt is made to refine
total resource requirement estimates. [9]

4.   Integration and Test

This phase includes the processes of merging all
system/software components and demonstrating performance quality.

Daly  [9]  identifies  four  stages  of  software  testing 
as  follows: 

-  Segment  or  unit  testing  verifies  the  operation  of 
individual  design  functions  as  they  are  developed. 

-  Module  testing  assesses  segments  combined  into 
modules. 


-  Integration  testing  evaluates  the  progressive 
activity  of  merging  all  software  into  a  single 
program. 

-  Systems  testing  assures  that  the  software  and  all 
associated  hardware  in  the  total  product  system 
can  function  satisfactorily  together.   During 
this  process  it  is  important  to  exercise  each 
function  under  full  load  or  stress  conditions 
such  that  the  environment  to  be  experienced  by  the 
user  is  simulated  as  closely  as  possible. 

Both unit and module testing may be included in the coding
phase. Integration and system testing are often duplicated,
first by the developing agency and then during acceptance
tests by the user. A potential for time and money savings
exists here by having the user present for final integration
and systems tests. It is an important opportunity for the
user to gain familiarity with the program and confidence in
the developer and program quality. Further, such arrangement
may result in satisfaction of select acceptance requirements
and thus cut test time. As the danger of compounding existing
disagreements is great, this opportunity should only be
exploited if, in the judgement of management, undue strain will
not be placed on the customer relationship.

Testing  requirements  must  be  written  and  agreed  upon 
very  early  in  the  development  cycle.   It  is  imperative  that 
they  reflect  user  involvement  and  represent  a  thorough  yet 
cost  effective  attempt  to  verify  system  performance.  [6] 

5.   Life Cycle Maintenance

Once acceptance testing has been satisfactorily
completed, and the system transferred to the user, the life
cycle support or maintenance phase begins. As Figure 3 [5]
indicates, this activity constitutes a growing majority of
development costs. Update maintenance, initiated by changed
specifications resulting from altered user requirements, must
generally be handled on demand. Unless such alterations can
be anticipated through close involvement with the user, little
can be done to minimize these changes. However, corrective
maintenance is a preventable evil. Improvement techniques
in all other phases must be invoked to minimize the occurrence
of operational "bugs." These errors are even more costly to
correct than those discovered in testing for the following
reasons [6]:

-  Problems are often more complex.

-  Problems are reported as system malfunctions
by operators not knowledgeable of data required
to duplicate failure; effort must be expended to
translate problem symptoms into systems error.
(Operator training may improve this problem.)

-  Problems are usually addressed by maintenance
programmers who are unfamiliar with program
development and must spend excessive time reviewing
detailed code (normally not top personnel [5]).

-  Another round of problem definition, design, code,
test and full documentation is initiated.

FIGURE 3. SOFTWARE MAINTENANCE COST GROWTH

D.   SOFTWARE DEVELOPMENT HISTORY AND CURRENT ISSUES

1.   General

The seemingly limitless applications of computers
forced an excessive demand on the productive capacity of the
software industry from its inception. This demand for results
and the disjointed response from many splinter companies and
work groups created a chaotic, fragmented growth pattern.
The sheer speed of growth precluded early development of
stylized professional standards which could have aided
individual project management control. Early development of such
standards was defeated on at least three counts.

In the first place, there was and continues to be a
perception that programming as an analytic activity conflicts
with the intrusion of conventions and rules, at least in the
minds of those involved. Many of the field's early successes
required inspired, innovative, problem-solving approaches
which might well have been stifled by the weight of ponderous
standards. [12] The time proven bromide that
standardization penalizes the best performances carries much credibility
for those who participate in the analytic process.

Further, the traditional approach to programming
involved much independent work and often formed a strong bond
between individual programmers and their programs (which
often symbolized a massive personal time commitment).
Programmers thus tended to be somewhat irrationally blinded by
pride of authorship when subjected to criticism of "their"
program. This work environment was not conducive
to the development or imposition of universal
programming standardization. [12]

Finally,  as  Paretta  and  Clark  [13]  point  out,  manage- 
ment control  was  deemphasized  and  thus  ineffective  in  early 
software  projects.   The  resulting  dysfunctional  behavior 
affected  productivity  and  product  quality  and  caused  complexi- 
ties to  abound.   Since  programming  ground  rules  were  unknown, 
managers  were  often  not  able  to  distinguish  many  aspects  of 
program  quality  (e.g.,  efficiency,  maintainability,  etc.). 
"Finding  it  difficult  to  reliably  measure  the  quality  dimen- 
sion, quantity  of  output  became  the  primary  focus  of  control.., 
The  ability  to  keep  programming  projects  on  schedule,  and  to 
complete  them  on  time  thus  became  the  two  major  criteria  by 
which  programmers  were  rewarded."  [13]   Despite  these  rewards, 
few  projects  came  in  on  time  with  acceptable  reliability.  The 
natural  response  to  such  stimuli  was  a  massive  dose  of  sub- 
optimization  manifested  by  routine  incorporation  of  shortcuts 
in  software  development.   Such  efforts  focused  on  immediate 
tangible  results  to  the  detriment  of  long  term  consequences. 
Programs were patched together with focus on speed of
completion and little or no interest in final structure or
documentation. The proliferation of complex program structures
in this environment is not surprising as the few planned
structures that did exist were soon infested by layers of debug
patches. Perhaps worst of all, an attitude of 'damn the
documentation, full speed ahead' infused itself in the
profession's work habits to such a degree that it remains one
of the most serious impediments to project control even today.

"The  pressure  to  produce  working  programs  often 
meant  that  there  was  little  time  for  programmers 
to  think  about  documenting  programs.   The  documen- 
tation that  was  available  was  mostly  inadequate 
because  few  conventions  existed  for  defining  what 
should  be  included  in  program  documentation,  and 
for  determining  what  level  of  detail  was  sufficient 
to  make  it  comprehensible.   Also,  documentation 
was  usually  kept  in  the  possession  of  the  original 
programmer, and not in a program library where
it  could  be  made  available  for  general  use.   This 
caused  great  confusion  when  one  programmer  was 
called  upon  to  perform  maintenance  on  a  program 
written  by  another,  especially  when  the  latter  was 
no  longer  with  the  firm."   [13] 

While  the  effects  of  much  of  this  early  confusion 
remain,  a  growing  effort  to  identify  and  address  such 
problems  is  evident. 

2.   Assessing  Project  Progress 

Without  doubt,  the  central  historic  issue  in  con- 
trolling software  development  has  been  the  inability  of 
management  to  successfully  assess  or  predict  progress  in 
software  development  projects.   [14,  15,  13]   As  noted 
earlier,  the  nature  of  software  is  characterized  by  the 
absence  of  physical  characteristics.   Since  software  develop- 
ment progress  must  be  measured  against  a  basically  mental 
process  of  problem  solving  with  no  tangible  outputs,  early 
project managers often merely relied upon either questioning
programmers or measuring man-hours expended to determine work
accomplished. Brooks [15] points out the folly of both
practices. Individual programmers are universally
overoptimistic with regard to evaluating their own work and
abilities to complete a project quickly. Further, number of
man-hours expended fails to measure the quality of time spent
or the relative ability of those working together to
effectively communicate and avoid redundant or conflicting
activities. [15, 11]

To gain control, management must intelligently create
intermediate deliverable items (e.g., specific design
documentation) for which personnel can be held accountable.
The  quality  (format,  completeness,  etc.)  of  deliverables  can 
be  specified  by  promulgated  standards.   Assignment  and 
scheduling  of  resources  to  each  of  these  deliverables  consti- 
tutes the  milestone  approach  to  controlling  development  pro- 
jects utilized  by  most  organizations  today.   Management 
methodology  used  in  resource  estimates  and  allocations  is  still 
far  from  standard,  often  relying  upon  individual  experience. 
Pioneering  work  in  the  principles  of  predicting  resource 
requirements  and  tracking  progress  has  been  published  but  is 
not yet widely used. [e.g., 16, 17, 18]
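
The contrast drawn above between man-hours expended and accepted
deliverables can be illustrated with a small sketch. It is not a
method taken from the cited references; the names Deliverable,
milestone_progress and hours_progress, and the sample figures, are
hypothetical and chosen only to show why the two measures can
disagree.

```python
from dataclasses import dataclass


@dataclass
class Deliverable:
    """One intermediate deliverable (hypothetical fields for illustration)."""
    name: str
    planned_hours: float
    accepted: bool = False  # True only once it meets the promulgated standard


def milestone_progress(deliverables):
    """Fraction of planned effort covered by accepted deliverables."""
    total = sum(d.planned_hours for d in deliverables)
    if total == 0:
        return 0.0
    done = sum(d.planned_hours for d in deliverables if d.accepted)
    return done / total


def hours_progress(hours_expended, deliverables):
    """The naive measure criticized above: hours spent over hours planned."""
    total = sum(d.planned_hours for d in deliverables)
    return hours_expended / total if total else 0.0


# Illustrative plan only; the numbers are invented for the example.
plan = [
    Deliverable("design specification", 200, accepted=True),
    Deliverable("module code", 500),
    Deliverable("test report", 300),
]
print(milestone_progress(plan))   # 0.2
print(hours_progress(600, plan))  # 0.6
```

In this invented case the project has consumed 60 percent of its
planned hours while only 20 percent of the planned work has been
delivered and accepted, which is the kind of gap a milestone
approach is intended to expose.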

3.   Development  Phase  Interrelationships 

Thibodeau  and  Dodson  [19]  postulate  a  cost  prediction 
model  which  recognizes  the  impact  of  variable  phase  interrela- 
tionships on  project  utilization.   In  individual  projects, 
these  relationships  may  be  either  controllable  or  forced  by 
constraints (of time, etc.). In either event, management
should be aware of their probable impact on schedule,
performance and cost. While actual interrelationships are
complex, the authors underscore the following project
management issues:

-  Inadequate  resources  allowed  for  design  (and  to  a 
lesser  extent  coding)  activities  will  result  in  more  costly 
testing  and/or  higher  error  rates  during  life  cycle  maintenance. 

-  Planned  phase  overlaps  (or  deviations  from  the  devel- 
opment plan  that  result  in  actual  phase  overlaps)  adversely 
affect  cost-driving  variables. 

-  Software  development  activities  are  difficult  to  pre- 
cisely define  and  restrict  to  particular  phases — this  ambiguity 
can  be  exploited  in  the  process  of  cost  reporting  by  inaccurately 
tying  the  easy  to  ascertain  incurred  costs  to  the  more  difficult 
to  measure  progress  accomplished. 

4.   Quality Documentation and Configuration Management

In effect, software is documentation. The task of building
another program copy from a full set of documentation would
certainly be trivial compared to generating a replacement set
of documentation solely from a program tape.

Further, quality documentation provides the following
benefits:

-  Assures  full  value  and  control  of  product  when 
delivered  to  customer. 

-  Minimizes  duplication  of  effort  by  recording  solved 
problems. 

-  Saves interruption time by allowing future investigators
to research on their own.

-  Compensates for the departure of an employee;
consolidates work completed for the organization.

Paradoxically,  while  the  importance  of  documentation 
is  virtually  unchallenged  in  the  computer  industry,  the 
delivery  of  timely,  complete  and  accurate  documentation  is 
rare. Much of this failure is attributable to the low esteem
in which most programmers hold documentation. [12, 14] "The
nature of programmers is such that interesting work gets done
at the expense of dull work and documentation is dull work."
[11]

Unfortunately, programmers must provide the bulk of effort in
documenting their programs since they are the only available
authority (without significant lead time). Therefore,
management must provide an incentive and control structure
that reinforces the importance of timely, quality
documentation. This is best done with firm development
standards to define milestone deliverables in detail, refusal
by management to recognize development progress without
delivery of appropriate documentation, and the early
institution of configuration management.

Configuration  management  is  a  control  process  which 
recognizes  the  importance  of  matching  documentation  with  soft- 
ware and  responds  to  the  dichotomy  between  the  ease  of  making 
program  changes  as  opposed  to  the  difficulty  and  tedium  in 
making documentation changes. If program changes are allowed
to be made without documentation, logical future program
refinements or corrections will be impossible. "It is better
not to have any documentation than to have documentation of
a former version. Without documentation it is at least
clear that to modify the program reliably one should ...
start from scratch." [20]

Configuration  management  is  important  throughout 
development  but  becomes  critical  in  the  integration  and  life 
cycle  support  phases  when  uncontrolled  changes  can  ruin  the 
entire  project.   When  formal  configuration  control  is  in 
effect,  each  proposed  code/documentation  change  must  be  sub- 
mitted with  justification  and  test  plan  (if  applicable)  for 
managerial  approval.   A  properly  run  configuration  control 
program  will  provide  a  developing  organization  the  following 
benefits:  [9] 

-  Software  changes  made  in  coordination  with  related 
hardware  changes. 

-  Each software change appropriately tested and
documented.

-  Design  new  versions  using  existing  software. 

-  If  multiple  versions  are  being  maintained,  ensure 
that  corrections  made  to  code  are  reflected  in 
all  common  software. 
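
The approval gate described above (each proposed change submitted
with justification and, if applicable, a test plan, and code never
changed without matching documentation) can be sketched as a simple
check. This is an illustrative assumption about how such a rule
might be recorded, not a description of any cited organization's
procedure; ChangeRequest, its fields and blocking_reasons are
hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """A proposed change under formal configuration control (illustrative)."""
    description: str
    justification: str
    code_changed: bool
    documentation_updated: bool
    test_plan: str = ""              # may stay empty when not applicable
    test_plan_required: bool = True  # set False when no test plan applies
    approved_by: str = ""            # approving manager, empty until approved


def blocking_reasons(cr):
    """Return the reasons a change may not yet be incorporated."""
    reasons = []
    if not cr.justification.strip():
        reasons.append("missing justification")
    if cr.test_plan_required and not cr.test_plan.strip():
        reasons.append("missing test plan")
    if cr.code_changed and not cr.documentation_updated:
        reasons.append("documentation does not match code change")
    if not cr.approved_by:
        reasons.append("not approved by management")
    return reasons


# Invented example: a code fix submitted without test plan or documentation.
request = ChangeRequest(
    description="correct range check in input routine",
    justification="fails on boundary input",
    code_changed=True,
    documentation_updated=False,
)
for reason in blocking_reasons(request):
    print(reason)
```

An empty result from blocking_reasons would indicate that the
change satisfies every condition modeled here; the sketch
deliberately omits coordination with hardware changes and
multiple-version maintenance.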

5.   Adequate Specification

Failure  during  a  project's  early  stages  to  translate 
user  requirements  accurately  and  completely  into  both  system 
and software specifications has been a major impediment to
the success of many software developments. [21, 22]
Inconsistencies and ambiguities introduced in this translation
process allow multiple interpretations during design and the
inevitable  accompanying  complex  structures  which  result  when 
design  guidance  is  allowed  to  convey  variable  meanings  to 
those  who  implement.   As  a  project  progresses,  conflicting 
development  assumptions  are  often  buried  by  short  term 
efforts  to  force  results  by  piecemeal  patching  aimed  at 
satisfying  piecemeal  specifications.   Residual  inconsistencies 
and  conflicts  inevitably  cause  major  problems  (system  degrada- 
tions and  failures)  in  integration/acceptance  testing  or 
during  system  operation,  often  with  devastating  consequences 
in  additional  resource  commitment.   Reluctance  to  produce 
formal,  quality  specifications  stems  largely  from  the  level 
of  effort  and  difficulty  involved  with  their  generation   [23] 
and  the  propensity  of  projects  to  proceed  on  their  own  momen- 
tum by  deriving  requirements  spontaneously  as  production 
needs  dictate.   Unfortunately,  these  requirements  created 
'on  the  fly'  are  often  found  to  be  in  conflict  with  true 
user/customer desires. This result is not surprising since
few  customers  plan  thoroughly  enough  to  know,  in  a  project's 
early  phases,  exactly  what  they  want,  much  less  what  words 
are  required  by  analysts/programmers  to  guide  production. 
The  traditional  result  has  been  that  specifications,  which 
should  function  as  precise  bases  for  common  agreement,  often 
"abound  with  ambiguous  terms  ('suitable,'  'sufficient,'  'real 
time,'  'flexible')  or  precise-sounding  terms  with  unspecified 
definitions  ('optimum,'  '99.9  percent  reliable')  which  are 
potential seeds of dissension or lawsuits once the software
is produced." [5]

Several difficult obstructions to management control
ripple  throughout  a  project  if  requirements  specifications 
are  of  poor  quality.   Most  visible  of  these  is  the  positive 
growth  in  relative  cost  to  correct  errors  during  each 
succeeding  development  phase.   Figure  4  [5,  24]  depicts  sum- 
mary data  from  three  corporations  concerned  with  large  scale 
software  development.   The  wisdom  of  investing  resources  in 
a  project  to  detect  and  correct  errors  in  early  phases  such 
as  definition/specification  instead  of  relying  on  development/ 
acceptance  test  efforts  is  evidently  justified  by  quantum  cuts 
in  quality  assurance  expenses.   Further,  poor  requirements 
specifications  offer  the  following  ills: 

-  User's  inputs  are  minimized  since  no  clear 
statement  of  desires  exists. 

-  Management  has  no  chance  to  exercise  control 
since  no  clear  production  goals  are  available. 

-  No  coherent  guidance  exists  for  design  personnel. 

-  Test  plans/procedures  are  impossible  to  write  in 
good  faith  since  there  are  no  hard  criteria  for 
project  performance  available.  [5] 

Generating  useful  specifications  is  a  time-consuming 
process  for  which  the  rewards  of  quality  are  normally  not 
validated  until  the  end  of  system  development.   This  demoti- 
vating  aspect  has  in  great  part  accounted  for  the  pitiful 
specification  efforts  that  have  crumbled  beneath  so  many  pro- 
jects.  Hope  in  this  area  has  emerged  in  the  form  of  growing 
attempts to automate the specification process. These efforts
include Teichroew's work with PSL/PSA (problem statement
language/problem statement analyzer) [25], Ross' Structured
Analysis [26] and TRW's SREM. [24] With computer assistance,
such systems are taking direct aim at eliminating or reducing
the ambiguities, inconsistencies and omissions which have
universally plagued specification generation. Their
increasing use by development agencies is an encouraging
indication of progress. TRW has developed and is continuing to
perfect SREM. Variants of both SREM and PSL/PSA are under
evaluation at FCDSSA. Hughes personnel have worked on a
Design  Analysis  System  (DAS)  which  incorporates  PSL/PSA  in  an 
interactive,  graphics  oriented  system  supporting  requirements, 
operations  and  software  design  verification.   Figure  5  [27] 
depicts the innovative and ambitious DAS concept.

6.   Top-Down Design

In  the  perfect  project  progression,  all  specification 
documents  would  be  complete  prior  to  the  design  phase.   The 
design  would  then  take  form  rather  easily  from  precise  speci- 
fications.  For  practical  reasons,  this  is  almost  never  the 
case.   To  feel  comfortable  with  cost  estimates,  management 
has  traditionally  initiated  one  or  more  software  designs  prior 
to  the  continuance  review  at  the  completion  of  the  specifica- 
tion effort.   This  rational  demand  for  more  information 
earlier  is  termed  "The  requirements/design  dilemma"  by  Boehm 
[5]  and  is  generally  justifiable  in  the  pursuit  of  improved 
estimation  data.   Unfortunately,  this  trend  is  often  parlayed 
into a "bottom-up" approach to design wherein software
components are actually developed prior to appropriate
consideration  of  potential  interface  and  integration  prob- 
lems.  Existing  software  components  then  drive  the  remaining 
design  effort.   This  backward  approach  to  design  has  survived 
for  so  long  because  there  was  little  formal  knowledge  of  the 
software  design  process  or  what  makes  a  good  software 
designer. 

As  in  specification,  design  procedures  developed 
spontaneously  in  the  early  'cottage  industry,'  with  no  mana- 
gerial guidance,  are  inadequate.   The  problem  has  been  one  of 
pressure  to  get  on  with  the  project  and  the  result  has  often 
been  incomplete  designs  which  cause  errors  that  are  detected 
later  in  the  project  when  cost  to  correct  is  highest.   Reli- 
ability and  life  cycle  costs  suffer  irreparably  in  this 
process.   "More  emphasis  needs  to  be  placed  on  software  design 
so  that  the  product  is  more  reliable,  less  costly  to  maintain 
and  easier  and  less  costly  to  operate.   So  often,  in  the 
expediency  of  getting  a  product  out  of  design,  these  factors 
are  totally  neglected  to  the  later  dismay  of  the  user,  when 
he  discovers  how  much  it  costs  to  maintain  and  operate  his 
new  system."  [10] 

'Top-down' design, as practiced by a growing number of
projects [23, 28, 30], seems to make much more sense in terms
of projecting and maintaining control. "It begins with a
top-level expression of a hierarchal control structure (often a
top-level 'executive' routine controlling an 'input,' a
'process,' and an 'output' routine) and proceeds to iteratively
refine each successive lower-level component until the entire
system  is  specified.   The  successive  refinements,  which  may  be 
considered  as  'levels  of  abstraction'  or  'virtual  machines,' 
provide  a  number  of  advantages  in  improved  understanding, 
communication  and  verification  of  complex  designs."  [5]    If 
specification  modifications  are  required  in  later  stages  due 
to  changing  user  needs,  disruptions  will  normally  be  minimized 
since  corrections  are  restricted  to  lower-level  code.   (Higher- 
level  code  should  require  no  modification  as  long  as  the  major 
purpose  of  the  program  remains  intact.  [10])   Beyond  the  testi- 
monial evidence  of  several  completed  projects,  an  indication 
of  the  labor/cost  savings  potential  of  top-down  design  was 
provided in initial experiments conducted by Comer and
Halstead. [29] The product of an emphasis on complete, timely
and quality design is the ability to focus early on the
potentially most challenging project problem areas (e.g.,
interface definition and test strategies).
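
The executive/input/process/output arrangement quoted above can be
sketched in a few lines. The example is a modern illustration in
Python, not code from any of the cited projects; the routine names
and the trivial "mean of numbers" task are assumptions made only so
the sketch runs. The point is that executive is written first and
fixes the interfaces, while the three lower-level routines are
refined later without disturbing it.

```python
def read_input(raw):
    """Input routine: refined here to parse whitespace-separated numbers."""
    return [float(token) for token in raw.split()]


def process(values):
    """Process routine: refined here to compute a mean (placeholder logic)."""
    return sum(values) / len(values) if values else 0.0


def write_output(result):
    """Output routine: refined here to format a one-line report."""
    return f"mean = {result:.2f}"


def executive(raw):
    """Top-level executive: designed first, unchanged by later refinement."""
    values = read_input(raw)
    result = process(values)
    return write_output(result)


print(executive("2 4 9"))  # mean = 5.00
```

If user needs changed the computation from a mean to some other
statistic, only process would be modified in this sketch, which
mirrors the claim that corrections tend to be confined to
lower-level code.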

Previously mentioned automated specification techniques
facilitate the top-down concept by providing "a medium for
improved communication between the proponent (user), designer,
coder and maintainer..." [30] Also of note are the efforts
to improve design representation over the traditional flow
charts. (E.g., the hierarchal input-process-output (HIPO)
technique produces easy-to-understand graphics which represent
software in a hierarchy of modules, each of which is symbolized
by its input, its output and a summary of the connective
processing. [5, 3])

7.   Ordered Program Structures

The  historic  causes  (lack  of  standardization,  pressures 
to  produce  quickly,  inadequate  documentation,  etc.)  and  result- 
ant ills  of  complex  coding  structures  have  been  mentioned. 
Several  techniques  have  been  proposed  and  utilized  to  simplify 
these  structures. 

a.  Structured Coding (or Structured Programming)

The theory of structured coding, developed by Dijkstra [32]
and expanded into a set of techniques by him and others, is
now in widespread use (including at the three facilities
visited: TRW, Hughes and FCDSSA). The most significant
feature of structured coding is the recognition that an
excess of branching statements contributes enormously to
structural complexity. With this realization in mind, program
modules are limited to single points of entry and exit and
branching statements within modules are strictly controlled.
Following these techniques maximizes sequential logic flow and
contributes greatly to readability and the enhancement of all
revision quality factors by simplifying and standardizing
program constructs.
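
A minimal sketch of a module written to these rules follows. It
is an illustration in Python rather than an example from the
cited literature, and the function name find_first_negative is
hypothetical. The module has one entry and one exit, and its
control flow is built only from sequence, selection and
iteration, with no jump out of the loop body.

```python
def find_first_negative(values):
    """Return the index of the first negative value, or -1 if none exists.

    Single entry, single exit: the loop condition carries the termination
    logic, so no early return or jump out of the loop body is needed.
    """
    index = 0
    found = -1
    while index < len(values) and found == -1:  # iteration
        if values[index] < 0:                   # selection
            found = index
        index += 1                              # sequence
    return found                                # the only exit point


print(find_first_negative([3, 0, -2, -5]))  # 2
```

Many present-day style guides accept an early return in so short
a routine; the stricter form is shown here only because it matches
the single-exit rule described in the text.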

b.  Chief  Programmer  Team 

The  Chief  Programmer  Team  (CPT)  concept  [33,  34] 
is  analogous  to  a  surgeon  surrounded  by  a  staff  of  specialists 
whose  function  it  is  to  maximize  his  performance.   The  chief 
programmer  similarly  acts  as  an  expert  surrounded  by  program- 
mers who  improve  his  efficiency  by  accomplishing  all  routine 
tasks  and  free  the  expert  to  concentrate  on  the  most  difficult 
aspects of the project.

"The chief programmer team represents a managerial
approach  to  program  development  that  offers  some  needed 
relief  from  the  problems  of  organizational  structure... 
The  emphasis  in  chief  programmer  teams  is  on  producing 
programs  that  are  well  designed  by  taking  advantage 
of  experienced  programming  talent,  rather  than  delegating 
important  programming  functions  to  inexperienced  pro- 
grammers on  a  'sink  or  swim'  basis.   Because  the  team  is 
organized around experienced programmers, projects can
develop  more  quickly  and  with  more  direction  than  when 
conventional  staffing  approaches  are  used.   Instead  of 
just  being  part  of  a  poorly  led  thundering  herd  of  junior 
programmers,  each  member  of  the  team  is  a  specialist 
who  makes  an  individual  contribution  to  the  project  under 
the  close  direction  of  the  chief  programmer.   The  arrange- 
ment enables  better  utilization  of  personnel,  reducing 
the  number  of  people  involved  in  a  programming  project. 
Not  only  does  this  generate  immediate  cost  savings,  it 
also  suppresses  the  numerous  communication  and  coordination 
problems  so  often  associated  with  software  projects.   As 
an  active  participant  in  all  stages  of  development,  the 
chief  programmer  is  also  in  a  better  position  to  evaluate 
the  headway  the  team  is  making  on  a  project.   His  direct 
involvement  means  he  does  not  have  to  rely  on  tangible 
evidence  to  gauge  a  project's  progress."  [13] 

A  modified  implementation  of  CPT  by  Naval  Air  Develop- 
ment Center,  Warminster  for  the  CVTSC  software  project  noted 
positive  results  in  maintaining  design  consistency  and  mini- 
mizing integration  problems  "which  arise  from  conflicting 
implementations."  [35]    On  the  negative  side,  this  approach 
may  be  limited  by  the  manning  available.   Further,  it  is 
doubtful  that  a  career  programmer  will  desire  to  spend  more 
than  a  few  projects  functioning  at  the  absolute  direction  of 
the  "Chief  Programmer." 

c.   Program  Walkthrus  -  Egoless  Programming 

Weinberg  [12]  articulated  the  problems  created  by 
ego  involvement  of  programmers  with  any  code  that  they  produce. 
"A  programmer  who  truly  sees  his  program  as  an  extension  of 
his own ego is not going to be trying to find all the errors
in that program. On the contrary, he is going to be trying to
prove  the  program  is  correct — even  if  this  means  the  oversight 
of  errors  which  are  monstrous  to  another  eye."   To  combat  this 
blinding  and  destructive  link  between  programmers  and  their 
code, "program walkthrus" have been instituted. In this
technique a review group of the programmer's peers (i.e., no
management personnel) scrutinizes code in detail prior to
running  it  on  a  computer  in  order  to  detect  errors  as  early 
as  possible.   Key  to  such  proceedings  is  the  atmosphere  of 
correcting  'our'  product  and  never  of  attacking  'your'  pro- 
gramming skill.   Reviewing/presenting  roles  must  be  rotated 
to avoid pressure build-up from constant review. [13]

8.   Test/Integration

Testing and debugging large scale software remains the
most  tedious,  frustrating,  expensive  and  unpredictable  phase 
of development. Despite massive expenditures, testing
successes remain limited by virtue of the overwhelming size and
complexity of many large scale systems. Operational software
is never completely free from error. Proof of the
ineffectiveness of past and current testing techniques lies in
the inevitable residual errors that occur after the most
rigorous testing available (e.g., "Software systems used for
the Apollo manned spaceflight program are probably one of the
most thoroughly tested programs in the world. Yet software
failures were detected in Apollos 8, 11 and 14." [36]).

As  in  specification  and  design,  initial  industry 
attempts to predict/guarantee satisfactory software operation
(i.e., reliability) were somewhat misguided. Specifically,
the  differences  between  the  well  understood  engineering  prin- 
ciples regarding  hardware  failure  and  repair  and  the  phenome- 
nology of  software  errors  and  correction  were  not  fully 
appreciated.   The  result  of  these  differences  was  a  general 
misapplication  of  assumptions  concerning  required  level  and 
type  of  test  effort  for  software  products.   The  effects  of 
these  misunderstandings  are  manifested  in  the  dramatic  increase 
in  the  ratio  of  actual  to  predicted  costs  to  maintain  pro- 
grams— i.e.,  to  correct  designs  and  debug  residual  errors 
remaining  in  operational  software  after  satisfactory  comple- 
tion of  testing.   (Figure  3  [5]  depicts  this  growth.   Note: 
Maintenance  costs  also  include  update  design  changes.) 

Analysis  of  the  growing  body  of  data  concerning  soft- 
ware errors  is  now  providing  a  number  of  germane  insights  into 
their  nature  which  should  be  closely  considered  in  future 
projects.  [5]   These  insights  include  the  following: 

-  Program  complexity  is  a  major  factor  in  the 
propensity  of  making  programming  errors  and  the 
level  of  effort  required  to  detect  and  correct. 
[1,  37] 

-  The development of test plans should begin as
soon as possible after specification. This early
development can pinpoint inconsistencies and
omissions in the software specifications. [9]

-  Testability should be an important consideration
in program design and architecture. [38]

-  Specification should be accomplished with
potential  structural  complexity  and  ease  of 
testing  as  prime  considerations.  [39,  40] 

-  A  series  of  automated  aids  for  test  generation 
and  program  evaluation  under  current  development 
or  appraisal  have  shown  excellent  potential  for 
improving  program  quality  and  reducing  develop- 
ment costs.  [36] 

9.   Verification  and  Validation  (V&V)  [6] 

Concern  for  assuring  quality  in  large  scale  programs 
has  led  to  the  development  of  a  systematic  process  of  ana- 
lyzing and  testing  documentation  and  code.   This  process 
takes  its  name  from  its  two  aims: 

Verification - The determination that each development
               phase satisfies formal and logical
               requirements of preceding phases.

Validation   - The determination that the developed
               software and documentation satisfy
               all performance requirements.

(The term validation is used in several different
senses  in  the  somewhat  related  fields  of  Department  of  Defense 
(DOD)  system/software  acquisition  and  software  development. 
These  differences  should  be  understood. 

-  A requirements 'validation' activity occurs in the
first (conceptual) phase of DOD system acquisition. This
activity addresses the legitimacy of defined requirements to
satisfy stated needs.

-  The second phase of DOD system acquisition is termed
the  'Validation  Phase.'   Here  'validation'  refers  to  the  con- 
ceptual proof  that  the  solution  (e.g.,  preliminary  system 
design)  is  ready  to  proceed  into  full  scale  development.  [7] 

-  As  used  in  'V  &  V,'   'validation'  is  a  set  of  activi- 
ties that  occur  during  the  test  and  integration  phase  of 
software  development.) 

A  properly  implemented  V  &  V  program,  invoked  in  a 
project's  earliest  stages,  can  both  assure  software  quality 
and  aid  management  in  assessing  development  progress.   As  in 
all  quality  assurance  activities,  the  program  must  be  accom- 
plished by  a  technically  competent,  independent  team  having 
no  political  connections  with  the  development  group.  [40,  41] 
Specific  techniques  utilized  by  the  V  &  V  team  can  be  adapted 
to  the  particular  program  characteristics  (real  or  non-real 
time,  scientific  or  business,  algorithmic  or  logic  intensive) 
and  depend  upon  a  case-by-case  cost-effectiveness  determina- 
tion.  Information  derived  is  useless  unless  fed  back  for 
timely  management  review  and  utilized  to  key  iterative  improve- 
ments to  deficient  areas.   A  general  chronological  list  of 
objectives  and  possible  techniques  is  included  below: 

-  Requirements Verification - Analyze each requirement
for criticality, risk, testability, and impact on software.
Set up mechanism to assure traceability.
May use problem statement languages, correctness
proofs, truth table exercises, abstract simulations.

-  Design Verification - Examine design logic,
structure, data base design, architecture and
documentation considering impact on all revision,
operation and transition quality factors.
(Correctness, efficiency and usability are
emphasized.)
May use special design languages, analytic
techniques, special simulators and models. [41]

-  Code Verification - Much iteration between Design
and Code efforts expected.
Inspect code to ensure design goals are followed,
complex structures minimized, organization's
procedures followed.
May use inspection, automated analysis aids
(e.g., static/dynamic analyzers, standards
enforcers, data base verifiers), automated
tracing mechanism, emulators, code level
simulators.

-  Validation - Parallels test and evaluation.
Includes both monitoring of developer's test
efforts and independent tests.
All quality factors important but emphasis is
on correctness and reliability.
Continuing thread of traceability from requirement
to design to code to test is a key.

The V & V concept is enjoying increased visibility and imple-
mentation, particularly  in  DOD-related  projects.   Whether  or 
not  its  nomenclature  is  formally  used  in  all  or  part  of  an 
individual  quality  control  program,  quality  assurance  goals 
and  limitations  remain  the  same.   Quality  is  a  function  of 
the  complete  development  cycle  and  cannot  be  tested  or  moni- 
tored into  a  system.   A  rigorous  review  and  audit  function  is 
only  as  good  as  the  effectiveness  of  its  feedback  loop  in 
causing  timely  product  and  process  improvements.  [6]    The 
potential  of  a  quality  assurance  organization's  success  is 
thus  defined  by  the  extent  of  promotion  and  backing  it 
receives from management policy and action.

III.   COMPLEXITY

A.   GENERAL 

The essence and impact of complexity as it relates to
computer programming is a difficult concept to convey and
quantify. Despite the difficulty, widespread recognition
that a better understanding of this relationship will
doubtless lead to improved management and an accompanying
reduction of software development costs has stimulated a
growing descriptive effort in the literature. In this chapter,
an attempt will be made to consolidate and extend the major
thrust of these ideas.

The traditional concepts (extent of varietal content and
degree of interrelationship) continue to be germane. However,
difficulties have arisen in applying these concepts to system
and software assessment and management. The description of a
particular aspect of complexity is often accompanied by a
metric, i.e., a method of quantification (by measuring a
surrogate) designed to provide an indication of the extent of
complexity present in a problem-solving process, computer
program or system. When a particular metric is heavily used
in a production or research project, language often becomes
relaxed and the distinction is sometimes lost between the
abstract degree of complexity present and the explicit attempt
to measure one of its manifestations. Since a potential for
false indication exists with all surrogate measures, this
distinction should be considered in the interpretation of
each metric. The most trivial complexity metric is merely
the number of source statements present in a program. Although
this metric is hardly accurate alone, large programs are
generally more complex than smaller ones and size is sometimes
useful in gauging the meaning of other metrics.

The  following  section  lists  a  number  of  methods  devised  to 
classify/quantify  various  facets  of  complexity. 

B.   TYPES  OF  COMPLEXITY 

1.   Conceptual  and  Software  Complexity 

Conceptual complexity refers to the level of difficulty
associated with conceiving and solving the real world
problem. Software complexity covers the form and structure
that results when this solution is translated to a computer
language. While these two aspects are not independent, their
functional relationship is neither simple nor consistent.
Indeed, the most trivial of concepts can be transformed into
a computer program so complex as to confound all efforts to
trace logic flows, find errors or make minor modifications.
Conceptual complexity is important to project management and
must be considered appropriately in terms of manpower mix,
etc. However, it is the inability to understand and control
software complexity which has traditionally been the downfall
of major projects. Classifications of computer related
complexity have generally either attempted to clarify or to
further subdivide conceptual and software complexity.

2.   Computational Complexity

Computational complexity is the level of involvement
and difficulty associated with computing functions. [42]
Work in this field deals with quantitative aspects of computed
solutions, recursive function theory and analyses of
computational models like the Turing machine [e.g., 43, 44]. In
relation to computer programs, computational complexity
metrics generally provide data relevant to some program
resource usage. CPU run time and core usage were among the
first concerns of programmers and directed early attention
to these manifestations of complexity. (As multiprocessed
and time shared computer systems evolved, other measures,
e.g., channel usage, device usage, secondary storage
requirements and supervisor usage, became important considerations.
[45] While these usage measures are related to complexity,
they are generally not considered direct manifestations.)

In describing computational complexity in logic circuits
and the Turing machine, Savage [46] identifies the
following complexity measures:

-  Combinational complexity: a measure of the 'size'
of a logic circuit. "The combinational complexity
of a function f relative to a basis $\Omega$ (set of
Boolean functions such as AND, OR & NOT), denoted
$C_\Omega(f)$, is the minimum number of elements from $\Omega$
needed to realize f with a logic circuit."

-  Delay complexity: a measure of the 'depth' of a
logic circuit. The depth of a combinational
machine is "equal to the number of logic elements
on the longest (directed) path from inputs to
outputs. The delay complexity of f with respect
to $\Omega$ is the depth of the smallest depth circuit
over $\Omega$ for f."

-  Turing machine program complexity: the length of
the shortest-length program for a function
$f: \{0,1\}^n \to \{0,1\}^m$ on a Turing machine.
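
As a rough illustration of the first two measures, the following Python sketch counts the logic elements (size) and the longest input-to-output path (depth) of one particular circuit. The dictionary encoding of the circuit and the function names are assumptions made for this example only. Because Savage's measures are minima taken over all circuits that realize f, the values computed for a single circuit are only upper bounds on the combinational and delay complexity of the function.

```python
from functools import lru_cache

# Hypothetical encoding: each gate name maps to (operation, list of inputs).
# Names that never appear as keys are treated as primary inputs.
# This circuit realizes XOR(a, b) over the basis {AND, OR, NOT}.
XOR_CIRCUIT = {
    "na": ("NOT", ["a"]),
    "nb": ("NOT", ["b"]),
    "t1": ("AND", ["a", "nb"]),
    "t2": ("AND", ["na", "b"]),
    "out": ("OR", ["t1", "t2"]),
}


def circuit_size(circuit):
    """Number of logic elements used by this particular circuit."""
    return len(circuit)


def circuit_depth(circuit, output):
    """Number of logic elements on the longest path from an input to output."""

    @lru_cache(maxsize=None)
    def depth(node):
        if node not in circuit:  # a primary input contributes no gate
            return 0
        _, inputs = circuit[node]
        return 1 + max(depth(src) for src in inputs)

    return depth(output)


print(circuit_size(XOR_CIRCUIT))          # 5 gates in this realization
print(circuit_depth(XOR_CIRCUIT, "out"))  # 3 gates on the longest path
```

A smaller or shallower circuit for the same function, if one exists over the chosen basis, would lower these bounds.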

3.  Psychological  Complexity 

Psychological  complexity  concerns  characteristics  of 
an  individual  program  which  make  it  difficult  to  understand 
and  manipulate.   "...psychological  complexity  assesses  human 
performance  on  programming  tasks."  [42] 

4.  Subjective  Metrics 

In the early phases of a project, predicted complexity
must be based upon the subjective evaluations of early
planners. Many organizations rely almost wholly upon prediction
by experienced analysts and programmers for cost estimates
and follow-on planning data. Such predictions take into
account similarities and differences with past projects and
naturally vary from individual to individual or group to group.
Subjective complexity ratings may simply be expressed by
quality descriptors (e.g., 'extremely complex,' 'very complex,'
etc.) or may be translated to rank numbers (e.g., from 1 to
5), depending upon the requirements stated by managers. While
such approaches may be useful in preliminary cost estimates
when little time or concrete data is available, their lack
of precision and susceptibility to individual bias generally
make them unacceptable for detailed planning, resource
allocation or test strategy formulation. Despite these apparent
weaknesses, many organizations have yet to progress beyond
subjective appraisals of complexity.

5.  Gilb  Metrics 

Gilb  [47]  proposed  a  methodology  to  measure  and  compare 
logical  and  structural  aspects  of  complexity  in  various 
systems.  [48] 

-  Logical complexity: the extent of decision-making
logic within a program or system. The metric
considers "absolute logical complexity" ($C_L$ =
number of nonnormal exits from decision statements
(IF, ON, AT END, etc.)) and "relative logical
complexity" ($c_L$ = ratio of $C_L$ to total number of
instructions).

-  Structural complexity: degree of interrelationships
between subprograms or subsystems. The metric
considers "absolute structural complexity"
($C_S$ = number of modules or subsystems) and "relative
structural complexity" ($c_S$ = ratio of module/
subsystem linkages to the total number of modules/
subsystems).
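
The four Gilb quantities reduce to simple counts and ratios. The sketch below is a minimal illustration; the function names and the sample counts are hypothetical, and obtaining the counts themselves (exits, instructions, modules, linkages) is assumed to be done elsewhere.

```python
def gilb_logical(nonnormal_exits, total_instructions):
    """Return (absolute, relative) logical complexity.

    absolute = number of nonnormal exits from decision statements
    relative = absolute / total number of instructions
    """
    absolute = nonnormal_exits
    relative = absolute / total_instructions
    return absolute, relative


def gilb_structural(modules, linkages):
    """Return (absolute, relative) structural complexity.

    absolute = number of modules or subsystems
    relative = linkages / number of modules or subsystems
    """
    absolute = modules
    relative = linkages / modules
    return absolute, relative


# Hypothetical program: 12 nonnormal exits in 200 instructions,
# organized as 8 modules joined by 14 linkages.
print(gilb_logical(12, 200))
print(gilb_structural(8, 14))
```

The relative forms allow programs or systems of different sizes to be compared on a common footing.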

6.  Thayer  Complexity  Model 

Thayer [49] offers consideration of various measurable
complexity surrogates, both separately and together (via
weighted formula), to understand the error proneness and
probable difficulty of error detection and correction in a
program. [48]

-  Logic complexity metric (referred to as Total Logic
Complexity, $L_{TOT}$) can be numerically evaluated for each
routine by calculating:

$$L_{TOT} = LS/EX + L_{LOOP} + L_{IF} + L_{BR},$$

where

LS = number of logic statements

EX = number of executable statements

$L_{LOOP}$ = computed loop complexity for the routine
in accordance with the following equation
(values scaled by x 1000):

$$L_{LOOP} = \sum_i m_i W_i,$$

where

$$W_i = 4^{i-1}\,\frac{3}{4^Q - 1} \quad \text{so that} \quad \sum_{i=1}^{Q} W_i = 1,$$

and

$m_i$ = number of loops in routine at indenture
or nesting level i

$W_i$ = weighting factor

Q = maximum level of indentures in the system

4 = shaping value.

$L_{IF}$ = computed IF complexity (number of IF statements,
nesting level) in accordance with the
following equation (values scaled by x 1000):

$$L_{IF} = \sum_i n_i W_i,$$

where

$n_i$ = number of IFs in routine at indenture or
nesting level i

$W_i$ = weighting factor

$L_{BR}$ = number of branches BR, times 0.001.

-  Interface complexity metric ($C_{INF}$) can be numerically
evaluated by calculating:

$$C_{INF} = AP + 0.5\,SYS,$$

where

AP = number of application program interfaces

SYS = number of system program interfaces

0.5 = estimated interface weighting factor.

-  Computational complexity metric (CC) can be numerically
evaluated as follows:

$$CC = (CS/EX) \cdot (L_{SYS}/\Sigma CS) \cdot CS,$$

where

CS = number of computational statements

$L_{SYS}$ = $\Sigma L_{TOT}$ (total logic complexity for each routine)

$\Sigma CS$ = the sum over all routines of the values of CS
for each routine.

-  Input/output complexity metric ($C_{I/O}$) is defined for
each routine as follows:

$$C_{I/O} = (S_{I/O}/EX) \cdot (L_{SYS}/\Sigma S_{I/O}) \cdot S_{I/O},$$

where

$S_{I/O}$ = number of input/output statements

$\Sigma S_{I/O}$ = sum over all routines of the values of $S_{I/O}$ for
each routine.


-  Readability ($U_{READ}$) is defined for each routine as
follows:

$$U_{READ} = COM/(TS + COM),$$

where

TS = total number of statements (executable plus
nonexecutable, exclusive of comment statements)

COM = number of comment statements.

-  Total complexity ($C_{TOT}$) combines all factors as follows:

$$C_{TOT} = L_{TOT} + 0.1\,C_{INF} + 0.2\,CC + 0.4\,C_{I/O} + (-0.1)\,U_{READ}$$
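
The Thayer model can be sketched in a few lines of Python. The sketch assumes the forms $L_{TOT} = LS/EX + L_{LOOP} + L_{IF} + L_{BR}$ and $C_{INF} = AP + 0.5\,SYS$, reads "values scaled by x 1000" as a multiplication by 1000, and uses function names and sample counts that are hypothetical; none of these choices should be taken as the model's authoritative definition.

```python
def level_weights(q):
    """Weights W_i = 4**(i - 1) * 3 / (4**q - 1) for levels i = 1..q.

    The factor 3 / (4**q - 1) makes the weights sum to 1.
    """
    return [4 ** (i - 1) * 3 / (4 ** q - 1) for i in range(1, q + 1)]


def nesting_complexity(counts_by_level, q, scale=1000.0):
    """Weighted sum of per-level counts (loops or IFs).

    counts_by_level[0] is the count at nesting level 1. The scale of
    1000 is an assumed reading of "values scaled by x 1000".
    """
    weights = level_weights(q)
    return scale * sum(c * w for c, w in zip(counts_by_level, weights))


def total_logic(ls, ex, loops_by_level, ifs_by_level, br, q):
    """L_TOT = LS/EX + L_LOOP + L_IF + L_BR, with L_BR = 0.001 * BR."""
    l_loop = nesting_complexity(loops_by_level, q)
    l_if = nesting_complexity(ifs_by_level, q)
    return ls / ex + l_loop + l_if + 0.001 * br


def interface(ap, sys_interfaces):
    """C_INF = AP + 0.5 * SYS (assumed form)."""
    return ap + 0.5 * sys_interfaces


def weighted_share(count, ex, l_sys, count_sum):
    """Common form of CC and C_I/O: (count/EX) * (L_SYS/sum) * count."""
    return (count / ex) * (l_sys / count_sum) * count


def readability(com, ts):
    """U_READ = COM / (TS + COM)."""
    return com / (ts + com)


def total_complexity(l_tot, c_inf, cc, c_io, u_read):
    """C_TOT = L_TOT + 0.1 C_INF + 0.2 CC + 0.4 C_I/O - 0.1 U_READ."""
    return l_tot + 0.1 * c_inf + 0.2 * cc + 0.4 * c_io - 0.1 * u_read


# Hypothetical single-routine system with at most 3 nesting levels.
l_tot = total_logic(ls=12, ex=60, loops_by_level=[2, 1, 0],
                    ifs_by_level=[5, 2, 1], br=9, q=3)
cc = weighted_share(count=30, ex=60, l_sys=l_tot, count_sum=30)
c_io = weighted_share(count=6, ex=60, l_sys=l_tot, count_sum=6)
print(total_complexity(l_tot, interface(4, 2), cc, c_io,
                       readability(com=20, ts=80)))
```

Because the weights grow by a factor of four per nesting level, a single deeply nested loop or IF contributes far more than several constructs at the outermost level.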

7.   McCabe Graph-Theoretic Complexity Model [50, 51, 48]
a.   Graph  Model  of  Programs 

McCabe  [50,  51],  Schneidewind  [1]  and  others  have 
noted  the  validity  of  utilizing  the  graph  model  to  represent 
computer  program  structure.   Briefly  defined,  a  graph  of  a 
program  is  composed  of  a  set  of  nodes  connected  by  a  set  of 
directed  arcs.   The  nodes  represent  statements  or  elements  of 
a  program  while  arcs  represent  program  control  flow. 

Figure  6  [1]  shows  a  graphic  representation  of  a 
simple  program  which  includes  several  basic  program  constructs. 
In  analyzing  control  flow  from  a  given  node,  'successor'  or 
'predecessor'  nodes  are  determined  by  the  indicated  directions 
of  connecting  flow.  [45] 

The most significant benefit of the directed graph
model is the attendant ability to measure certain complexity
surrogates related to the graphic representation. These
measures can then be used to control complexity and develop optimal
testing methodology.

Figure 6.   DIRECTED GRAPH REPRESENTATION OF
A SIMPLE PROGRAM

b.  Cyclomatic Complexity Metric

McCabe [50] defines cyclomatic number V of graph
G with n vertices, e edges and p connected components as follows:

$$V(G) = e - n + p$$

By limiting application of this definition to single entry and
exit programs, an equivalence between V and the maximum number
of linearly independent circuits is asserted.
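
The definition can be evaluated directly from a list of nodes and arcs. The following sketch computes e - n + p for a small IF THEN ELSE graph; the node names are hypothetical, and the graph includes an added arc from the exit back to the entry, the usual device for making a single entry, single exit program graph strongly connected so that V counts independent circuits.

```python
def connected_components(nodes, edges):
    """Count connected components, ignoring arc direction."""
    neighbors = {n: set() for n in nodes}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:  # depth-first sweep of one component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbors[node] - seen)
    return count


def cyclomatic_number(nodes, edges):
    """V(G) = e - n + p."""
    return len(edges) - len(nodes) + connected_components(nodes, edges)


# IF THEN ELSE: two branches from the entry rejoin at the exit.
nodes = ["entry", "then", "else", "exit"]
edges = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
    ("exit", "entry"),  # added arc closing the graph
]
print(cyclomatic_number(nodes, edges))  # 5 - 4 + 1 = 2
```

The result of 2 corresponds to the two independent circuits through the THEN branch and the ELSE branch.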

Schneidewind [1] extends this interpretation as
follows:

"Since V is equal to the number of independent
circuits, it is equal to a set of sub-structures which can be
identified in a directed graph. When structured programming
techniques are used, the independent circuits are identified
with the constructs of structured programming: While DO,
IF THEN, IF THEN ELSE, etc." Further, "...by generating all
circuits from the fundamental circuits, the different
execution sequences which must be tested can be identified.
Secondly, the frequency of occurrence of an arc in the circuits
indicates the relative importance of testing the arc."

c.  Other  Directed  Graph  Related  Complexity  Metrics 

-  Reachability (R): summation, over the nodes,
of the number of available ways to reach a
node. (Average reachability (r) = R/number of nodes.)

-  Number of Paths ($N_p$): minimum number of paths
(i.e., no loop traversed more than once in
succession).

-  Number of Nodes (N )

-  Number of Arcs (N )
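
A small sketch may help fix the reachability and path-count ideas. It rests on several assumptions that are not stated in the text: the graph is acyclic (loops are not handled), a "way to reach a node" is taken to mean a distinct path from the entry node, the entry node itself counts as reachable in one way, and the number of paths is taken as the count of distinct entry-to-exit paths. The function and node names are hypothetical.

```python
from functools import lru_cache


def path_counts(nodes, edges, entry):
    """Number of distinct paths from entry to each node (acyclic graphs only)."""
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    @lru_cache(maxsize=None)
    def ways(node):
        if node == entry:
            return 1  # assumption: the entry is reached in exactly one way
        return sum(ways(p) for p in preds[node])

    return {n: ways(n) for n in nodes}


# IF THEN ELSE graph: two branches from the entry rejoin at the exit.
nodes = ["entry", "then", "else", "exit"]
edges = [("entry", "then"), ("entry", "else"),
         ("then", "exit"), ("else", "exit")]

counts = path_counts(nodes, edges, "entry")
reachability = sum(counts.values())            # R: summed over the nodes
average_reachability = reachability / len(nodes)
number_of_paths = counts["exit"]               # entry-to-exit paths

print(reachability, average_reachability, number_of_paths)  # 5 1.25 2
```

Handling loops would require the "no loop traversed more than once in succession" rule, which this sketch deliberately leaves out.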

8.   Halstead  Metric 

Halstead [e.g., 52] proposed and refined a comprehensive
discipline concerning "measurable properties of written material
expressed either in computer program or in prose." [53] The
chief tenet of this discipline (now known as Software Science)
is the application of natural science methodology to
investigate characteristics of written communication. With regard to
software complexity, Halstead reported an important metric to
gauge program difficulty which took into account the variety of
instructions (vocabulary) and their frequency of usage (length).
Instructions were subdivided by operator codes and operand
addresses. The Halstead effort metric (E) is calculated as
follows:

$$E = \frac{n_1 N_2 (N_1 + N_2) \log_2 (n_1 + n_2)}{2 n_2}$$

where

$n_1$ = number of unique operators
$n_2$ = number of unique operands
$N_1$ = total frequency of operators
$N_2$ = total frequency of operands.
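
As a worked illustration, the effort metric can be computed from the four counts. The sketch below uses the standard Software Science relationships E = V/L, with volume V = (N1 + N2) log2(n1 + n2) and estimated level L = (2/n1)(n2/N2); the function name and the sample counts are hypothetical.

```python
import math


def halstead_effort(n1, n2, N1, N2):
    """Halstead effort E = V / L.

    V = (N1 + N2) * log2(n1 + n2)   program volume
    L = (2 / n1) * (n2 / N2)        estimated program level
    """
    volume = (N1 + N2) * math.log2(n1 + n2)
    level = (2 / n1) * (n2 / N2)
    return volume / level


# Hypothetical program: 10 unique operators, 6 unique operands,
# 30 operator occurrences and 20 operand occurrences.
print(halstead_effort(n1=10, n2=6, N1=30, N2=20))
```

For these counts the volume is 50 x 4 = 200 and the level is 0.2 x 0.3 = 0.06, so the effort is about 3333.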

The effort value E indicates the number of mental comparisons required
to generate a program. Follow-up experimental work has found
significant correlation between Halstead's metrics and such
measures of programmer performance as program errors, program
quality and time to program. [54]

9.   System Complexity

Much of the theory developed around aspects of conceptual
and software complexity can be abstracted and applied to
the organization and structure of systems. As an example,
the directed graph model might be utilized to represent a
system structure with communication paths translated into arcs
and modules translated into nodes. A cyclomatic number
analysis can then be used to indicate the more complex system
structures and/or used in the system design process to
maintain ordered system structure.

IV.   THE NAVAL POSTGRADUATE SCHOOL (NPS) EXPERIMENT

A.   PURPOSE 

As indicated in the previous chapter, numerous theoretic
approaches to defining and measuring complexity have been
proposed. While these approaches are useful in understanding
complexity relative to the programming task, many of them are
difficult to apply directly to management control, either
because they are too subjective (e.g., psychological complexity),
because they require data that is unavailable until the
project is essentially complete (e.g., the Halstead Metric) or
because they have not yet been sufficiently corroborated by
empirical data. In an important step to address this
operational requirement, Schneidewind [1] directed an experiment
conducted by Hoffman at the Naval Postgraduate School (NPS)
designed to provide quantitative data in support of the
following:

-  The hypothesis that complexity is a significant
determinant of both the propensity to commit
programming errors and the time required to detect
and correct existing errors.

-  If the hypothesis is true, a determination of valid
complexity measure(s) to predict probability of
programming error commission and the difficulty of
error detection/correction.

Detailed methodology and results of this experiment are
available [1, 37] and will not be covered here. However, for
continuity of discussion, a brief overview of the experiment
and its potential application is presented.

B.   APPROACH  AND  RESULTS 

Corroboration and extensions of the previously cited work
by McCabe [50] concerning cyclomatic numbers and other
measures were primary concerns of the NPS experiment. In
conducting the experiment, four projects were programmed by Hoffman
as part of his Master's in Computer Science degree requirements.
[37] The work was accomplished in ALGOL W for IBM 360/370
execution. Such software engineering concepts as top-down
design and structured walkthroughs were used throughout. Error
categories were broken down in comprehensive detail.
Information was then collected concerning the design, coding,
debugging and testing phases of each project along with error
listings recording the nature of each error discovered. Of
particular interest was the distribution of labor time used
to detect and correct errors and the relation of selected
complexity metrics to the structure containing each error.
Table III [1] depicts project sizes and man-hour distribution.
The following complexity metrics were evaluated:

-  NUMBER  OF  PATHS  (Np) 

-  CYCLOMATIC  NUMBER  (V) 

-  REACHABILITY  (R) 

-  AVERAGE  REACHABILITY  (r) 

-  NUMBER OF SOURCE STATEMENTS (S)

Results and analyses indicated that while a linear
relationship could not be proved, all complexity metrics
considered were significantly higher for structures which had
errors, thus supporting the original hypothesis. Tables IV and
V [1] summarize these results. Also, error detection and
correction times were generally longer for programs of higher
complexity metrics. Further:

"When the number of errors found in procedures was
correlated with cyclomatic number and number of source
statements, the correlation coefficients were higher
for other complexity measures. It also appeared that
these two measures were related to the total error
detection and total error correction times. It was
learned that trying to keep the cyclomatic number
small not only reduced the number of errors but also
contributed to the reduction of debugging and testing
efforts." [37]

C.  PROJECT  SCOPE 

Two limitations of scope should be recognized in evaluating
results of the NPS experiment:

-  Designing and Coding/Debugging activities were
emphasized at the expense of analysis and integration issues.

-  The small scale of the projects raises the question of
validity in extrapolating conclusions directly to large
scale software development projects.

D.  APPLICATION 

The major value of the NPS experiment is the high quality
of error data obtained in terms of detailed error type
definition and careful recording procedures. Reported results
provide an important corroboration of McCabe's work, strongly
indicating that "it would be worthwhile to use complexity
measures as a program design control to discourage complex
programs and as a guide for allocating testing resources."
[1]

TABLE IV
NPS EXPERIMENT

Correlation Coefficients
(Error Properties vs. Complexity Measures)

                                        Correlation    Number of
                                        Coefficient    Procedures

Number of Errors Found vs.
  Cyclomatic Number                         .78            31
  Number of Source Statements               .59            31
  Number of Paths                           .76            20
  Reachability                              .77            20
  Average Reachability                      .78            20

Labor Time (Man-Mins) to Find Error vs.
  Cyclomatic Number                         .67            31
  Number of Source Statements               .59            31
  Number of Paths                           .90            20
  Reachability                              .90            20
  Average Reachability                      .87            20

Labor Time (Man-Mins) to Correct Error vs.
  Cyclomatic Number                         .72            31
  Number of Source Statements               .51            31
  Number of Paths                           .65            20
  Reachability                              .66            20
  Average Reachability                      .71            20

TABLE V
NPS EXPERIMENT

Complexity Measure Comparison
(Procedures with no Errors vs. Procedures with Errors)

                                   No Errors              Errors
                               Mean     Number of     Mean     Number of
                               Value    Procedures    Value    Procedures

Cyclomatic Number              1.699       83          4.74       31
Number of Source Statements    9.361       83         27.23       31
Number of Paths                2.671       82         27.1        20
Reachability                  10.1         82        120.3        20

V.   THE ROLE OF COMPLEXITY IN RESOURCE ESTIMATION

AND ALLOCATION

A.  GENERAL 

It can be argued that blame for the historically
inaccurate cost predictions indicated in Chapter I can be attributed
to poor estimation techniques as well as to the management
control issues emphasized in Chapter II. Widespread
acknowledgement of this failing is reflected by the impressive
extent of research and experimentation in the past decade
directed to improve the largely judgmental state of the art
that persists in software cost estimation. This chapter will
briefly cover certain problems and approaches involved in
the estimation process, describe and offer an evaluation of
one existing model (Putnam) and suggest an application of the
cyclomatic number complexity metric to resource estimation
and allocation.

B.  ISSUES  IN  SOFTWARE  RESOURCE  ESTIMATION 
1.   New  Dynamic  Field 

Wolverton [11] observes that "the software industry
is young, growing, and marked by rapid changes in technology
and application. It is not surprising then, that the ability
to estimate costs is still relatively undeveloped." Beyond
the significant number of evolutionary improvements to the
programming profession wrought by its practitioners, the
direction of the software development process has largely been
driven by the frantic rate of new developments in computer
hardware which were generally not aimed at rectifying software
difficulties. This dynamic, disjointed environment of change
has prevented the development of mature cost estimation
techniques.

2.  Quality  and  Testing 

One  important  manifestation  of  the  changing  nature  of 
software  development  is  the  growing  emphasis  toward  testing 
to  assure  quality.   As  quality  becomes  more  an  issue  in  system 
design,  the  proportionate  amount  of  time  spent  in  each  phase 
of  the  development  cycle  will  change,  thus  invalidating  past 
project  guidelines  and  making  estimation  more  difficult.  [54] 

3.  Programming  Units  of  Measure 

Wolverton [55] cites the unreliability of available
units of measure used to gauge programming quality and
productivity as one of the most difficult impediments to accurate
software cost estimation (as well as software management).
His list of measures which can produce false indications in
certain circumstances includes the following:

-  Lines  of  code  written  per  programmer  month. 

-  Man  months  of  effort  per  k  lines  of  code. 

-  Defects  per  k  lines  of  code. 

-  Man  months  of  effort  per  k  bytes  of  code. 

-  Object  instructions  measurements. 

-  Man  hours  per  instruction. 

-  Cost  per  defect. 

-  Defect removal per k lines of code.

-  Defects processed per man month.

-  Machine hours and terminal hours used per
programmer month.

-  Machine hours and terminal hours per k lines of code.

-  Cost per page of documentation.

4.  Fragmented  and  Proprietary  Research 

While the academic orientation of the programming
profession has encouraged and supported publication of much of
the detail pertaining to newly devised estimation procedures,
actual large scale projects are almost totally accomplished
by individual firms in a competitive industry. Protective
policies and the mechanics of responding to requests for
proposals have placed much empirical data from specific
projects in a proprietary category. Thus the important
experimental data from individual project failures and successes in
different firms has not been comprehensively assimilated.

5.  Individual Resource Costs
a.   Labor 

The labor factor of software development cost is
highly dependent on programmer productivity. Unfortunately
for estimation efforts, individual variances in productivity
are extreme and difficult to predict. As an example, Ogdin
(quoted in [56]) cites a study involving twelve experienced
programmers who accomplished the identical programming task
with the following productivity variances:

-  25:1 in coding time

-  26:1 in debug time

-  11:1 in CPU time

-  13:1 in execution speed

-  5:1 in number of lines coded.

The existence of these wide performance variances causes such
difficulty in conducting controlled experiments regarding the
utility of programming languages, tools and techniques that
productivity fluctuations often mask the influence of the
factor under investigation. Productivity rates of a specific
individual or group in a particular internal environment must
be appropriately assessed if cost estimates are to be accurate.

b.   Elapsed  Time 

The amount of calendar time available for a software
development project has a significant impact on costs.
A useful cost estimate must provide guidelines for the
allocation of resources over the total predicted elapsed time to
accomplish the following:

-  Coordinate time-phased funding.

-  Account for costs that are time dependent.

-  Assign resources for all explicit and implied
tasks resulting from the work unit breakdown.

-  Manage the project within budget constraints.

It is clearly critical that management appreciate the
time requirements of a prospective project early in the
estimation/bid process. While schedules are normally
specified in development contracts, development organizations must
approach original acceptance of contract schedules or later
schedule changes with the utmost caution and a thorough risk
analysis. [57] As Brooks points out:

"The  number  of  months  of  a  project  depends  upon  its 
sequential  constraints.   The  maximum  number  of  men 
depends  upon  the  number  of  independent  subtasks. 
From  these  two  quantities  one  can  derive  schedules 
using  few  men  and  more  months.   (The  only  risk  is 
product  obsolescence.)   One  cannot,  however,  get 
workable  schedules  using  more  men  and  fewer  months. 
More  software  projects  have  gone  awry  for  lack  of 
calendar  time  than  for  all  other  causes  combined."  [15] 

c.   CPU  Time 

In the past, a difficult management issue to resolve
was the appropriate trade-off to be made between slack
computer time and slack programmer time. In one case, if computer
time was so scarce that programmers could not be guaranteed
access to a machine, progress was held up and schedules
degraded. Alternately, if computer time was easily available
with few effective constraints, programmers tended to attempt
much of their analysis, design and debug work on the machine
when another environment might have been more suitable and
efficient. [57] With the current availability of interactive
terminals and sophisticated software test tools, coupled with
the high cost of programming labor, management's role appears
to have been altered to one of ensuring availability of
appropriate tools and work environment to maximize productivity.

6.   Lack of Sufficient Software Engineering Data Base

Boehm [58] explains the difficulty involved with
analyzing software problems thoroughly as follows:

"One of the reasons progress has been so slow is that
it's just plain difficult to collect good software
data... These difficulties include:

-  Deciding which of the thousands of possibilities
to measure.

-  Establishing standard definitions for "error,"
"test phase," etc.

-  Establishing development performance criteria.

-  Assessing subjective inputs such as "degree of
difficulty," "programmer expertise," etc.

-  Assessing the occurrence of post facto data.

-  Reconciling the sets of data collected in
differently defined categories."

7.  Continuous Project Change

An individual engaged in cost estimation must live
with the fact that the program being estimated is never the
program actually developed. Changes may occur as the result
of the user finally discovering what he really wants, the
developer finally owning up to his inability to solve the
technical problem or an unforeseen change in the environment.
Whatever the reason, the change process has been observed so
frequently that Lehman has pronounced its inevitability as
his "First Law in Large-Program Evolution."

"The Law of Continuing Change arises from the fact that
the world, in this case the computing environment,
undergoes continuing change; all programs are models of some
part, aspect or process of the world. They must therefore
be changed to keep pace with the needs of a changing
environment, or become progressively less relevant, less
useful and less cost effective." [59]

8.  Documentation

Software documentation constitutes one of the largest
and most difficult to manage 'hidden' costs in software
development. When it is contracted for and produced in
quantity, it is normally not adequately reviewed and rarely
fulfills its functions. Alternately, when it is minimized as a
cost saving measure, both the customer and the developer
(not necessarily in equal proportions) suffer future costs in
reinventing solutions. Figure 7 [60] shows the theoretical
relationship of varying documentation costs to total project
costs with a hypothetical optimum documentation level. [61]

9.   Ability to Transfer Existing Code

An important opportunity to save development costs
obviously exists when part of the programming has been
accomplished previously. Cost estimates vary according to the
amount of project code that must be newly generated or can be
transferred or retrofitted from existing programs. However,
estimates involving transfer and retrofit of code are unique
problems which must take into account the interfaces and
design constraints required to make existing code fit.
Forcing existing code into a design may result in unwanted
complex structures. At some point a developer may find it
more cost effective to rewrite code than to transfer or
retrofit it. This evaluation should be an output of the estimation
process. [57]

C.   TYPES  OF  ESTIMATION 

In this section the major approaches to estimation are
categorized and briefly described. It should be noted that in
practice more than one approach is frequently used, either in
combination or as cross verification, while evaluating a single
project.

Figure 7.   RELATION OF DOCUMENTATION COST TO
TOTAL PROJECT COST

1.   Engineering Estimation [57] (also Bottom-up [11],
Quantitative [56])

Engineering estimation is a generic term encompassing
any methodology that systematically considers and evaluates
all known, pertinent factors bearing on resource utilization.
Variations of this method constitute the most highly used
approach to software cost estimation. The basic procedure
consists of breaking down a project effort into discrete work
units (activities, tasks, etc.) and formulating separate
estimates for each unit. Identification of an appropriate work
breakdown structure is a critical step in this process.
Costs in each separate activity can be aggregated into three
cost centers: programmer productivity, computer time and elapsed
project time. Once the difficulty of defining work units is
resolved, the total number of work units is multiplied by a
cost per unit factor or a productivity factor derived from
estimates of software complexity and duration. Various
software development factors unique to the project in question are
often evaluated, reduced to a single weighting factor, and
used to modify the derived estimate. The entire procedure is
normally iterated several times during a project as more
detailed data progressively becomes available. Engineering
estimation is heavily reliant on the estimator's ability to
evaluate each software development project in its unique
internal development environment.

"A basic disadvantage of the many versions of this
technique is the subjective assessment of the weighting
factor used to modify the derived estimate. Also
problematic is the previously determined cost per unit
factor because it is not always clear what that cost
includes (i.e., direct labor, direct labor plus
overhead) and the unit (i.e., machine instructions, source
statements) is often incomparable between projects." [56]
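
The engineering (bottom-up) procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `WorkUnit` type, the unit of size, the cost-per-unit figures and the weighting factor of 1.2 are all hypothetical and do not come from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    """One entry of a work breakdown structure (hypothetical type)."""
    name: str
    size: float           # number of work units, e.g. blocks of source statements (assumed)
    cost_per_unit: float  # cost per work unit, e.g. man-days from past projects (assumed)


def engineering_estimate(units, weighting_factor=1.0):
    """Sum size * cost-per-unit over all work units, then apply one
    project-specific weighting factor to the derived estimate."""
    base = sum(u.size * u.cost_per_unit for u in units)
    return base * weighting_factor


# Illustrative figures only.
units = [
    WorkUnit("design", 10, 2.0),
    WorkUnit("coding", 40, 1.5),
    WorkUnit("test", 25, 2.0),
]
estimate = engineering_estimate(units, weighting_factor=1.2)
print(f"estimated effort: {estimate:.1f} man-days")
```

The single `weighting_factor` argument is where the subjectivity criticized in the quotation above enters: the arithmetic is trivial, but the result is only as good as the factor and the per-unit costs supplied.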

2.  Parametric Relationships [57] (also ratio
estimating [11])

These relationships have concentrated on the program
design, coding and program testing phases. The most
comprehensive work done in this area was a System Development
Corporation study in the mid 1960's sponsored by the Air Force
Systems Command. This effort culminated in a massive
regression analysis involving over 90 factors thought to be useful
in predicting resource utilization. [55, 62]  Determining
which relationships are key to an individual project is the
major operational problem with this approach.

3.  Analogous Estimates [56] (also similarities and
differences [11])

An initial task breakdown is accomplished to a level
compatible with similar items in prior systems. Analogies
are then drawn to known historic costs with adjustments made
to account for technical differences. This method is heavily
dependent upon the existence of an accurate, updated data
base and/or upon the cost estimator's ability to recall
relevant material and make proper analogies and adjustments. The
analogy technique has been criticized for both the lack of a
valid data base of historical performance, cost and schedule
data, and for the non-linear relationship between system
costs and system size which confuses analogous comparisons.

4.  Top-Down  Estimation  [11] 

Wolverton  [11]  describes  this  approach  as  follows: 

"The  estimator  relies  on  the  total  cost  of  the 
large  portions  of  previous  projects  that  have  been 
completed  to  estimate  the  cost  of  all  or  large  portions 
of  the  project  to  be  estimated.   History  coupled  with 
informed  opinion  (or  intuition)  is  used  to  allocate 
costs  between  packages." 

Like analogous estimates, top-down estimating has been
criticized for its dependence on data bases and the subjective
skills of the estimator. [56]

5.  Rules  of  Thumb 

Many developed cost models have been reduced to rules
of thumb for quick evaluations and checks against other
estimates. Such rules can be quite useful if they are not
relied upon solely. Table VI [57] summarizes a number of
these rules.

6.  The Putnam Model

a.   Summary  of  Approach 

An interesting approach to the software sizing and
estimation problem was developed by Putnam [16] in his work
with budgetary data from the U. S. Army Computer System
Command. His effort is an extension of research by Norden
[18] who found that man-loading for research and development
projects can be linked to a project profile. Figure 8 [70]
depicts individual manning phases tied to cycles underlying a
summing "Project Profile" curve. Putnam represents Norden's
model with the Rayleigh manpower equation which has been
empirically determined to fit the project curve. The two
important forms include both

- the derivative:

$$ Y' = 2Kat\,e^{-at^{2}} $$

where Y' = man years of effort per year

K = total man years expended to
develop system

a = "problem solving rate" parameter
which determines curve shape

t = elapsed time in years

- and the integral:

$$ Y = K\left(1 - e^{-at^{2}}\right) $$

where Y = cumulative man-years over time t

Figure 9 [70] shows this Putnam-Rayleigh Model in both useful
curve forms. Putnam further identifies the value

$$ K / t_d^{2} $$

where $t_d$ is the time to reach peak effort, as an indicator
of the difficulty of a system in terms of the programming
effort to produce it. To complete the cost prediction
process, estimates of the two parameters of Putnam's model, K
(the total life cycle man-years), and $t_d$ (the time for the
derivative curve to reach a maximum), are used to derive an
equation giving the ordinates of the manpower requirement
curve for a specific project. Yearly cost figures are then
computed for the project by multiplying the ordinates of the
manpower curve at each year by the average cost/man-year to
arrive at a cost/year. These yearly costs are then summed to find
the cumulative cost. [56]
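
The cost prediction steps just described can be sketched as follows. The function names, the sample values of K and t_d, and the cost per man-year are illustrative assumptions, not data from Putnam. The relation a = 1/(2 t_d^2) is not stated in the text; it follows from setting the derivative of Y' to zero, which places the peak of the manpower curve at t = 1/sqrt(2a).

```python
import math


def shape_parameter(t_d):
    """Shape parameter a for a curve peaking at t_d (from dY'/dt = 0)."""
    return 1.0 / (2.0 * t_d ** 2)


def manpower_rate(K, a, t):
    """Y' = 2 K a t exp(-a t^2): man-years of effort per year at time t."""
    return 2.0 * K * a * t * math.exp(-a * t * t)


def cumulative_effort(K, a, t):
    """Y = K (1 - exp(-a t^2)): cumulative man-years expended by time t."""
    return K * (1.0 - math.exp(-a * t * t))


def difficulty(K, t_d):
    """Putnam's difficulty indicator K / t_d^2."""
    return K / t_d ** 2


def yearly_costs(K, t_d, years, cost_per_man_year):
    """Ordinate of the manpower curve at each year times average cost/man-year."""
    a = shape_parameter(t_d)
    return [manpower_rate(K, a, t) * cost_per_man_year
            for t in range(1, years + 1)]


# Hypothetical project: 100 life cycle man-years, peak effort at 2 years.
K, t_d = 100.0, 2.0
costs = yearly_costs(K, t_d, years=8, cost_per_man_year=50_000.0)
for year, cost in enumerate(costs, start=1):
    print(f"year {year}: {cost:,.0f}")
print(f"cumulative cost: {sum(costs):,.0f}")
print(f"difficulty K/t_d^2: {difficulty(K, t_d):.1f}")
```

Sampling the curve once per year and summing, as the text describes, only approximates the area under the curve; `cumulative_effort` gives the exact man-years implied by the model at any time t.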

b.  Management  Implications  According  to  Putnam  [70] 

(1)  Life Cycle Size (K), Development Time ($t_d$)
and difficulty ($K/t_d^{2}$) are natural parameters of a system.
Each system is inherently stable and will be driven toward
these parameters which constitute the minimum cost solution
to the software design problem.

(2)  Management cannot cut the development time
of a project without increasing difficulty. All changes
are biased to the negative direction. Development time
cannot be arbitrarily set.

(3)  If K, $t_d$ and $K/t_d^{2}$ are accurately determined,
a system can be designed-to-cost with little uncertainty.

c.  Evaluation  of  the  Putnam  Model 

Putnam  has  pointed  out  an  impressive  number  of 
past projects conforming to his calculations. [16]  If K
and $t_d$ can be confidently derived, the effort required to
complete  the  estimation  is  minimal  since  the  process  can  be 
easily  automated.   The  breakdown  of  costs  by  time  is  an 
especially  significant  management  aid. 

General  criticisms  of  the  Putnam  model  include 
the  following: 

- Total reliance on man-years as a measure of
work, thereby ignoring type of work. [57]

- Estimates of non-manpower costs (e.g., computer
time and overhead) inadequately addressed. [57]

- Accurate determination of K and $t_d$ from historic
data can be time consuming if such data is not
easily available or in a usable form. [57]

-  No  conclusive  data  has  been  published  concerning 
projects  utilizing  the  Putnam  Model  as  a  major 
planning  tool. 

-  No  economic  theory  has  been  adequately  presented 
to  support  Rayleigh  curve  fit  for  cost  curves. 

- Ease of automation may seduce weak managers
into using the model inappropriately.

D.   APPLYING  THE  CYCLOMATIC  NUMBER 
1.   Utility 

Many of the issues covered previously regarding
software development and resource estimation suggest the
importance of ordered program structures to both software quality
and costs. If a method of measuring and controlling
complexity in program structures is available, management will be
able to accomplish the following [1]:

-  Avoid  error  prone  structures. 

-  Cut  costs  involved  in  extensive  test  and  debug. 

-  Decrease  time  (and  related  costs)  associated 
with  extended  development  cycles. 

- Assist in developing more standard modules.

- Facilitate resource estimation by increasing
standardization and decreasing variance in
programming productivity.

-  More  efficiently  allocate  resources  by  fitting 
manpower  and  schedule  planning  to  complexity 
patterns. 

The  original  work  by  McCabe  [50]  and  supporting  work  by 
Schneidewind  [1]  indicate  that  the  cyclomatic  number  offers 
a  tool  to  effectively  limit  complexity.   Its  advantages 
include  the  following: 

-  Easy  to  understand  and  calculate. 

-  Requires  information  that  can  be  developed 
during  estimation  process. 

- Provides a finite number which can be used
for planning.

- Facilitates formulation of appropriate test
strategy, test input data and allocation of
testing resources by

-  identifying independent substructures and

-  identifying heavily used logic paths.

2.   Setting a Design Threshold

The particular upper bound to be set for the
cyclomatic number is somewhat arbitrary and can probably be varied
slightly from project to project. McCabe [35] suggests 10
as a reasonable upper limit. Since he found a variance among
programmers from the 3 to 7 range to the 40 to 50 range, the
imposition of such a limit would obviously radically alter
the approach of many programmers and would necessitate an
introductory training period. The important point in
implementing a cyclomatic number constraint is the ability of
management to articulate the policy fully to programmers and
enforce it by insisting that structures in violation be
either modularized or redone.
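
A threshold check of this kind is simple to automate. The sketch below assumes McCabe's standard definition of the cyclomatic number, v(G) = e - n + 2p (edges, nodes and connected components of the control flow graph), which this section does not restate. The graph representation, the function names and the module figures are hypothetical.

```python
def cyclomatic_number(edges, components=1):
    """v(G) = e - n + 2p for a control flow graph given as (from, to) pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


def over_limit(module_complexity, limit=10):
    """Names of modules whose cyclomatic number exceeds the design threshold."""
    return sorted(name for name, v in module_complexity.items() if v > limit)


# A single IF-THEN-ELSE: one decision node, two branches, one join node.
if_then_else = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(cyclomatic_number(if_then_else))  # 4 edges - 4 nodes + 2 = 2

# Hypothetical module measurements checked against the suggested limit of 10.
modules = {"parser": 12, "scheduler": 7, "report": 10}
print(over_limit(modules))  # only "parser" exceeds the limit
```

Modules returned by `over_limit` are the candidates for being modularized or redone under the policy described above; the limit is passed as a parameter because the text allows it to vary slightly from project to project.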

3.   Test  Strategy  and  Resource  Allocation 

Even with an effective threshold, structures will
naturally vary in complexity. In formulating the test
procedure, more personnel, computer time and schedule time can be
assigned to the structures with higher cyclomatic numbers in
order to better allocate resources. Additionally, the
directed graph analysis highlights portions of the software that
are most heavily utilized in the logic flow and where program
errors would be most damaging. Test input data can be
selected to concentrate on these structures within the time
constraints of the testing phase.
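
One simple way to act on this is to divide a fixed testing budget in proportion to each structure's cyclomatic number. The proportional rule, the function name and the figures below are assumptions made for illustration; the text only says that more resources can be assigned to the more complex structures.

```python
def allocate_test_hours(cyclomatic, total_hours):
    """Split total_hours across structures in proportion to cyclomatic number."""
    total = sum(cyclomatic.values())
    if total == 0:
        raise ValueError("no complexity data to allocate against")
    return {name: total_hours * v / total for name, v in cyclomatic.items()}


# Hypothetical structures, all within a design threshold of 10.
structures = {"parser": 8, "scheduler": 4, "io": 4}
plan = allocate_test_hours(structures, total_hours=160)
for name, hours in plan.items():
    print(f"{name}: {hours:.0f} hours")
```

A weighting for how heavily a structure is used in the logic flow could be multiplied into each entry before the split, but choosing such weights is a judgment the text leaves to the test planner.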

VI.   SUMMARY AND CONCLUSIONS

-  For various reasons relating to its nature and historic
evolution, the process of large scale software development
has been plagued by an inability of management to assess and
control software complexity. Improving this ability will be
a key factor in future project cost estimates, resource
allocations, cumulative costs and software quality.

-  In light of historic trends, the greatest potential for
resource savings exists in the analysis/design and the
test/integration phases of software development. Certain automated
management tools have shown promise in these areas, but more
experiential data is needed.

-  As recognition of the importance of complexity has grown,
a number of theorists and researchers have proposed methods
of describing, estimating and measuring the extent of
complexity's influence in individual programs.

-  Perhaps because of the multifaceted nature of complexity,
none of the proposed approaches has been shown to be
sufficient in all cases. This fact may indicate the need for a
"complexity profile," i.e., a comprehensive evaluation using
more than one metric.

-  An argument has been presented in support of the
cyclomatic number (from McCabe's Directed Graph application to
modeling software) as a useful tool for control of complexity
and the allocation of resources.